betanbinom (Beta-Negative Binomial)#

The beta-negative binomial distribution models a discrete count:

the number of failures you see before achieving n successes, when the success probability p is unknown and is modeled as a Beta(a,b) random variable.

It is a natural choice when you want a negative binomial model but also want to account for heterogeneity or uncertainty in the success probability.


Learning goals#

By the end you should be able to:

  • write the PMF and recognize it as a Beta mixture of a negative binomial

  • derive the mean and variance using the law of total expectation/variance

  • implement a NumPy-only sampler via the hierarchical definition

  • visualize the PMF/CDF and check Monte Carlo simulations

  • use scipy.stats.betanbinom (and fit parameters via a custom MLE routine)

import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

from scipy.optimize import minimize
from scipy.special import betaln, gammaln, hyp2f1
from scipy.stats import betanbinom

pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
np.set_printoptions(precision=6, suppress=True)

rng = np.random.default_rng(42)

1) Title & Classification#

  • Name: betanbinom (Beta-negative binomial)

  • Type: Discrete distribution

  • Support (SciPy standardized form):

    • (k \in \{0,1,2,\dots\})

    • with shift: (X = K + \text{loc})

  • Parameter space (SciPy):

    • (n \ge 0) (often an integer “number of successes”, but the PMF can be extended via Gamma functions)

    • (a > 0), (b > 0) (Beta shape parameters)

Moment existence (important for interpretation):

  • (\mathbb{E}[X]) exists if (a > 1)

  • (\mathrm{Var}(X)) exists if (a > 2)

  • higher moments require larger (a) (heavy tail when (a) is near 1)
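A quick sketch of this heavy-tail behavior, using the hierarchical definition directly (parameters here are illustrative; `sample_bnb` is just the two-step recipe made formal in section 7). With (a=1.2) the mean exists ((a>1)) but the variance does not ((a\le 2)), so sample variances swing wildly across replications even though the sample mean settles near (nb/(a-1)=50):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_bnb(n, a, b, size, rng):
    # Hierarchical sampler: p ~ Beta(a,b), then X | p ~ NegBin(n, p).
    p = rng.beta(a, b, size=size)
    return rng.negative_binomial(n, p)

# a = 1.2: mean exists (a > 1) but variance does not (a <= 2),
# so the sample variance is unstable across replications.
for _ in range(3):
    x = sample_bnb(n=5, a=1.2, b=2.0, size=100_000, rng=rng)
    print(f"mean = {x.mean():8.2f}   var = {x.var():14.2f}")
```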

2) Intuition & Motivation#

What it models#

You run Bernoulli trials (success/failure). You stop once you have seen n successes. Let (X) be the number of failures observed before the (n)-th success.

If the success probability (p) were known and fixed, (X) would follow a negative binomial distribution.

In many real settings, (p) varies across people, items, time periods, or is simply uncertain. A Beta distribution is a standard way to encode that uncertainty:

  • (p \sim \mathrm{Beta}(a,b))

  • (X \mid p \sim \mathrm{NegBin}(n, p)) (failures before (n) successes)

Marginalizing out (p) gives betanbinom.

Typical real-world use cases#

  • Conversion funnels / retries: number of failed attempts before achieving n successes (signups, purchases), when conversion rate varies by user.

  • Reliability / quality control: how many failed tests occur before n passes, with heterogeneous pass rates.

  • Overdispersed counts: compared to a plain negative binomial, the extra randomness in (p) produces heavier tails.

Relations to other distributions#

  • Negative binomial: recovered when the Beta distribution concentrates at a fixed (p) (e.g., (a+b\to\infty) with (a/(a+b)=p)).

  • Beta-geometric: when (n=1), you count failures before the first success.

  • Beta-binomial: finite-trial analogue (random (p), then a Binomial count).

3) Formal Definition#

Hierarchical (mixture) definition#

[ \begin{aligned} P &\sim \mathrm{Beta}(a,b), \\ X \mid P=p &\sim \mathrm{NegBin}(n,p), \qquad X \in \{0,1,2,\dots\}. \end{aligned} ]

Here (X) counts failures before (n) successes, so the conditional PMF is

[ \Pr(X=k\mid p) = \binom{n+k-1}{k} (1-p)^k p^n, \qquad k\ge 0. ]

PMF#

After integrating out (p), the (standardized) PMF is

[ \Pr(X=k) = \binom{n+k-1}{k} \frac{B(a+n,\, b+k)}{B(a,b)}, \qquad k=0,1,2,\dots ]

where (B(\cdot,\cdot)) is the Beta function:

[ B(x,y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}. ]

CDF#

Because the support is discrete/infinite, a standard definition is

[ F(k) = \Pr(X\le k) = \sum_{j=0}^{\lfloor k \rfloor} \Pr(X=j). ]

In practice, we compute this sum numerically or use SciPy’s cdf/sf.

def logpmf_betanbinom(k, n, a, b):
    # Stable log-PMF using log-gamma and log-beta.
    # PMF: C(n+k-1, k) * B(a+n, b+k) / B(a,b),  k=0,1,2,...
    k = np.asarray(k)

    k_int = k.astype(int)
    if np.any(k_int != k):
        raise ValueError("k must be integer-valued for the discrete PMF.")

    # Evaluate at max(k, 0) to avoid warnings, then mask k < 0 to -inf below
    # (so only the out-of-support entries get probability 0, not the whole array).
    k_safe = np.where(k_int < 0, 0, k_int)

    # log binomial coefficient with Gamma functions
    log_choose = gammaln(n + k_safe) - gammaln(n) - gammaln(k_safe + 1)
    out = log_choose + betaln(a + n, b + k_safe) - betaln(a, b)

    return np.where(k_int < 0, -np.inf, out)


def pmf_betanbinom(k, n, a, b):
    return np.exp(logpmf_betanbinom(k, n, a, b))


def cdf_betanbinom(k, n, a, b):
    k = np.asarray(k)
    k_floor = np.floor(k).astype(int)
    out = np.zeros_like(k_floor, dtype=float)

    for idx, kk in np.ndenumerate(k_floor):
        if kk < 0:
            out[idx] = 0.0
            continue
        ks = np.arange(0, kk + 1)
        out[idx] = pmf_betanbinom(ks, n, a, b).sum()

    return out


def support_for_plotting(rv, q=0.999, fallback_max=2000):
    # Choose a reasonable finite support grid for plotting.
    try:
        k_max = rv.ppf(q)
    except Exception:
        k_max = np.nan

    try:
        k_max = np.asarray(k_max).item()
    except Exception:
        pass

    if np.isfinite(k_max):
        k_max = int(k_max)
        return np.arange(0, max(k_max, 20) + 1)

    # Fallback: mean + a few std devs (if finite)
    try:
        mean, var = rv.stats(moments="mv")
    except Exception:
        mean, var = np.nan, np.nan

    if np.isfinite(mean) and np.isfinite(var) and var >= 0:
        k_max = int(np.ceil(mean + 8 * np.sqrt(var)))
        k_max = min(k_max, fallback_max)
        return np.arange(0, max(k_max, 20) + 1)

    # Last resort: fixed cap
    return np.arange(0, 200 + 1)
# Quick correctness check vs SciPy
n, a, b = 8, 4.5, 2.0
rv = betanbinom(n, a, b)
ks = np.arange(0, 80)

max_abs_err = np.max(np.abs(pmf_betanbinom(ks, n, a, b) - rv.pmf(ks)))
max_abs_err
1.942890293094024e-16

4) Moments & Properties#

A useful general result: factorial moments#

For discrete counts, falling factorial moments often simplify:

[ (X)_m = X(X-1)\cdots(X-m+1). ]

For betanbinom, using the Poisson–Gamma representation of the negative binomial and then mixing over (p), one obtains (for (a>m)):

[ \mathbb{E}[(X)_m] = (n)_m \; \frac{B(a-m,\, b+m)}{B(a,b)}, ]

where ((n)_m = \frac{\Gamma(n+m)}{\Gamma(n)}) is the rising factorial (Pochhammer symbol), while ((X)_m) above denotes the falling factorial.

From factorial moments you can recover raw moments using Stirling numbers of the second kind.

Mean and variance (closed form)#

Provided (a>2):

[ \mathbb{E}[X] = \frac{n\,b}{a-1}, \qquad \mathrm{Var}(X) = \frac{n\,b\,(a+b-1)\,(n+a-1)}{(a-1)^2\,(a-2)}. ]

Skewness and kurtosis#

Closed forms exist but are algebraically messy; a clean route is:

  1. compute (\mathbb{E}[(X)_m]) for (m=1,2,3,4)

  2. convert to (\mathbb{E}[X^r]) for (r\le 4)

  3. convert to central moments and then to skewness/kurtosis

We’ll do this in code (and verify against SciPy’s stats).

PGF / MGF / characteristic function#

A probability generating function (for (|z|<1)) is

[ G(z)=\mathbb{E}[z^X] = (1-z)^{-n}\,\frac{B(a+n,b)}{B(a,b)}\; {}_2F_1\!\left(n,\, a+n;\, a+b+n;\, -\frac{z}{1-z}\right). ]

The MGF is (M(t)=G(e^t)); because the tails decay only polynomially, it exists only for (t \le 0).

Entropy#

There is no simple closed-form expression in general; numerically:

[ H(X) = -\sum_{k=0}^{\infty} \Pr(X=k)\,\log \Pr(X=k), ]

approximated by truncating the sum where the remaining tail probability is negligible.

# Factorial moments and conversion to mean/var/skew/kurt

def factorial_moment(m, n, a, b):
    # E[(X)_m] for m=0,1,2,... (requires a>m)
    if m == 0:
        return 1.0
    if a <= m:
        return np.nan

    poch = np.exp(gammaln(n + m) - gammaln(n))  # (n)_m rising
    return poch * np.exp(betaln(a - m, b + m) - betaln(a, b))


def raw_moments_up_to_4(n, a, b):
    f1 = factorial_moment(1, n, a, b)
    f2 = factorial_moment(2, n, a, b)
    f3 = factorial_moment(3, n, a, b)
    f4 = factorial_moment(4, n, a, b)

    # x^r = sum_{j=0}^r S(r,j) (x)_j, for r<=4
    e1 = f1
    e2 = f2 + f1
    e3 = f3 + 3 * f2 + f1
    e4 = f4 + 6 * f3 + 7 * f2 + f1
    return e1, e2, e3, e4


def mean_var_skew_kurt(n, a, b):
    mu1, m2, m3, m4 = raw_moments_up_to_4(n, a, b)

    var = m2 - mu1**2
    mu3 = m3 - 3 * mu1 * m2 + 2 * mu1**3
    mu4 = m4 - 4 * mu1 * m3 + 6 * mu1**2 * m2 - 3 * mu1**4

    skew = mu3 / (var ** 1.5)
    ex_kurt = mu4 / (var ** 2) - 3
    return mu1, var, skew, ex_kurt


n, a, b = 8, 4.5, 2.0
mu, var, skew, ex_kurt = mean_var_skew_kurt(n, a, b)
mu, var, skew, ex_kurt
(4.571428571428572, 33.044897959183615, 4.84601064484599, 117.57126976284624)
# Compare to SciPy
n, a, b = 8, 4.5, 2.0
rv = betanbinom(n, a, b)
scipy_mu, scipy_var, scipy_skew, scipy_ex_kurt = rv.stats(moments="mvsk")

np.array([mu, var, skew, ex_kurt]), np.array([scipy_mu, scipy_var, scipy_skew, scipy_ex_kurt])
(array([  4.571429,  33.044898,   4.846011, 117.57127 ]),
 array([  4.571429,  33.044898,   4.846011, 117.57127 ]))

Entropy (numerical) and generating functions#

A good way to build intuition is to check that:

  • a truncated entropy sum agrees with SciPy’s entropy() (up to truncation error)

  • the closed-form PGF matches the definition (G(z)=\sum_k z^k\Pr(X=k)) for (|z|<1)

from math import exp


def pgf_betanbinom(z, n, a, b):
    # Probability generating function G(z)=E[z^X] for |z|<1.
    z = np.asarray(z)

    # log prefactor: (1-z)^(-n) * B(a+n,b)/B(a,b)
    log_pref = -n * np.log(1 - z) + (betaln(a + n, b) - betaln(a, b))
    x = -z / (1 - z)

    return np.exp(log_pref) * hyp2f1(n, a + n, a + b + n, x)


n, a, b = 10, 6.0, 3.0
rv = betanbinom(n, a, b)

# Entropy: SciPy vs truncation
ks_H = support_for_plotting(rv, q=0.9999)
pmf_H = rv.pmf(ks_H)
H_trunc = -np.sum(pmf_H * np.log(pmf_H + 1e-300))
tail_mass = 1.0 - rv.cdf(ks_H[-1])

H_scipy = rv.entropy()

print(f"Entropy (SciPy)    : {H_scipy:.6f}")
print(f"Entropy (truncated): {H_trunc:.6f}")
print(f"Truncation tail mass ~ {tail_mass:.2e}")

# PGF sanity check
z = 0.4
ks = support_for_plotting(rv, q=0.9999)
pgf_sum = np.sum(rv.pmf(ks) * (z ** ks))
pgf_closed = pgf_betanbinom(z, n, a, b)

print(f"PGF via sum      : {pgf_sum:.12f}")
print(f"PGF closed-form  : {pgf_closed:.12f}")
print(f"abs diff         : {abs(pgf_sum - pgf_closed):.2e}")

# MGF exists at least for t<0 (since z=e^t<1)
t = -0.25
print(f"MGF(t) with t={t}: {pgf_betanbinom(exp(t), n, a, b):.6f}")
Entropy (SciPy)    : 2.828474
Entropy (truncated): 2.827167
Truncation tail mass ~ 9.90e-05
PGF via sum      : 0.142197143995
PGF closed-form  : 0.142197143995
abs diff         : 5.55e-17
MGF(t) with t=-0.25: 0.384537

5) Parameter Interpretation#

n (number of successes)#

  • In the “stop after n successes” story, n is how many successes you require.

  • Larger n typically increases the scale (more opportunities to accumulate failures).

a, b (Beta prior on the success probability)#

For (P\sim\mathrm{Beta}(a,b)):

  • (\mathbb{E}[P] = \frac{a}{a+b}) (prior mean success probability)

  • (a+b) controls concentration (how variable (P) is)

Interpretation:

  • Large a (relative to b) pushes (P) toward 1 → fewer failures.

  • Large b pushes (P) toward 0 → more failures and heavier tail.

  • Small a+b means (P) varies a lot → betanbinom is more overdispersed than a plain negative binomial.

def plot_pmf(params_list, q=0.995, title="PMF comparison"):
    fig = go.Figure()
    max_k = 0
    for (n, a, b, name) in params_list:
        rv = betanbinom(n, a, b)
        ks = support_for_plotting(rv, q=q)
        max_k = max(max_k, ks.max())
        fig.add_trace(go.Bar(x=ks, y=rv.pmf(ks), name=name, opacity=0.6))

    fig.update_layout(
        title=title,
        xaxis_title="k (failures before n successes)",
        yaxis_title="PMF",
        barmode="overlay",
        width=850,
        height=450,
    )
    fig.update_xaxes(range=[0, min(max_k, 120)])
    return fig


params = [
    (10, 2.0, 2.0, "n=10, a=2, b=2 (mean p=0.5, high var)"),
    (10, 20.0, 20.0, "n=10, a=20, b=20 (mean p=0.5, low var)"),
    (10, 2.0, 8.0, "n=10, a=2, b=8 (mean p=0.2)"),
]
plot_pmf(params, title="How (a,b) shapes the PMF").show()
params_n = [
    (1, 6.0, 3.0, "n=1"),
    (5, 6.0, 3.0, "n=5"),
    (20, 6.0, 3.0, "n=20"),
]
plot_pmf(params_n, title="Effect of n (Beta prior fixed)").show()

6) Derivations#

6.1 Expectation#

Using the hierarchical model and the law of total expectation:

[ \mathbb{E}[X] = \mathbb{E}\!\left[\,\mathbb{E}[X\mid P]\,\right]. ]

For (X\mid P=p\sim\mathrm{NegBin}(n,p)) (failures before (n) successes):

[ \mathbb{E}[X\mid p] = \frac{n(1-p)}{p}. ]

So

[ \mathbb{E}[X] = n\,\mathbb{E}\!\left[\frac{1-p}{p}\right] = n\,(\mathbb{E}[p^{-1}] - 1). ]

For (P\sim\mathrm{Beta}(a,b)), (\mathbb{E}[p^{-1}]) exists only if (a>1), and

[ \mathbb{E}[p^{-1}] = \frac{B(a-1,b)}{B(a,b)} = \frac{a+b-1}{a-1}. ]

Therefore

[ \boxed{\mathbb{E}[X] = \frac{n\,b}{a-1}} \quad (a>1). ]
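The key inverse-moment identity (\mathbb{E}[p^{-1}] = \frac{a+b-1}{a-1}) is easy to sanity-check by Monte Carlo (a minimal sketch with arbitrary (a,b)):

```python
import numpy as np

# Monte Carlo check of E[1/p] = (a+b-1)/(a-1) for p ~ Beta(a,b), a > 1.
rng = np.random.default_rng(1)
a, b = 4.5, 2.0

p = rng.beta(a, b, size=1_000_000)
mc = np.mean(1.0 / p)
closed = (a + b - 1) / (a - 1)
print(f"Monte Carlo: {mc:.4f}   closed form: {closed:.4f}")
```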


6.2 Variance#

Law of total variance:

[ \mathrm{Var}(X) = \mathbb{E}[\mathrm{Var}(X\mid P)] + \mathrm{Var}(\mathbb{E}[X\mid P]). ]

For the negative binomial (failures before (n) successes):

[ \mathrm{Var}(X\mid p) = \frac{n(1-p)}{p^2}. ]

Both terms require (\mathbb{E}[p^{-2}]), which exists only if (a>2). After algebra you get:

[ \boxed{\mathrm{Var}(X) = \frac{n\,b\,(a+b-1)\,(n+a-1)}{(a-1)^2\,(a-2)}} \quad (a>2). ]


6.3 Likelihood (for data)#

Given i.i.d. observations (k_1,\dots,k_m) from betanbinom(n,a,b), the likelihood is

[ L(n,a,b) = \prod_{i=1}^m \binom{n+k_i-1}{k_i}\,\frac{B(a+n,\, b+k_i)}{B(a,b)}. ]

A numerically stable log-likelihood uses gammaln and betaln (log-Beta).

In many applications n is known (you decide how many successes to wait for), and you fit a,b.

# Verify mean/variance by Monte Carlo
n, a, b = 10, 6.0, 3.0
rv = betanbinom(n, a, b)

mu_theory, var_theory = rv.stats(moments="mv")

samples = rv.rvs(size=200_000, random_state=rng)

mu_mc = samples.mean()
var_mc = samples.var(ddof=0)

(mu_theory, var_theory), (mu_mc, var_mc)
((6.0, 36.0), (6.03206, 36.4657421564))

7) Sampling & Simulation (NumPy-only)#

Algorithm (hierarchical sampling)#

The definition already gives a sampler:

  1. Sample (p\sim\mathrm{Beta}(a,b))

  2. Sample (X\mid p\sim\mathrm{NegBin}(n,p)) (failures before n successes)

This is efficient and easy to vectorize.

Below is a NumPy-only implementation.

def rvs_betanbinom_numpy(n, a, b, size, rng=None):
    # NumPy-only sampler using the hierarchical definition.
    rng = np.random.default_rng() if rng is None else rng

    p = rng.beta(a, b, size=size)
    # NumPy's negative_binomial returns failures before n successes (success prob = p)
    x = rng.negative_binomial(n, p)
    return x


n, a, b = 10, 6.0, 3.0
x_np = rvs_betanbinom_numpy(n, a, b, size=10_000, rng=rng)
x_np[:10], x_np.mean()
(array([8, 1, 3, 3, 0, 5, 2, 6, 6, 8]), 6.0321)

8) Visualization#

We’ll plot:

  • PMF (exact)

  • CDF (exact)

  • Monte Carlo histogram compared to the PMF

n, a, b = 10, 6.0, 3.0
rv = betanbinom(n, a, b)
ks = support_for_plotting(rv, q=0.995)

pmf = rv.pmf(ks)
cdf = rv.cdf(ks)

fig_pmf = go.Figure()
fig_pmf.add_trace(go.Bar(x=ks, y=pmf, name="PMF"))
fig_pmf.update_layout(
    title="betanbinom PMF",
    xaxis_title="k",
    yaxis_title="P(X=k)",
    width=850,
    height=420,
)
fig_pmf.show()

fig_cdf = go.Figure()
fig_cdf.add_trace(go.Scatter(x=ks, y=cdf, mode="lines+markers", name="CDF"))
fig_cdf.update_layout(
    title="betanbinom CDF",
    xaxis_title="k",
    yaxis_title="P(X ≤ k)",
    width=850,
    height=420,
)
fig_cdf.show()

# Monte Carlo vs PMF
samples = rv.rvs(size=50_000, random_state=rng)

fig_mc = px.histogram(
    samples,
    nbins=int(ks.max() + 1),
    histnorm="probability",
    title="Monte Carlo histogram vs exact PMF",
)
fig_mc.update_layout(xaxis_title="k", yaxis_title="empirical probability", width=850, height=420)
fig_mc.add_trace(go.Scatter(x=ks, y=pmf, mode="markers", name="exact PMF"))
fig_mc.show()

9) SciPy Integration#

SciPy provides scipy.stats.betanbinom as a discrete distribution.

Common methods:

  • pmf(k, n, a, b) / logpmf(...)

  • cdf(k, n, a, b) / sf(...)

  • rvs(n, a, b, size=..., random_state=...)

  • stats(..., moments='mvsk'), entropy(...)

About fit#

As of SciPy 1.15.0, rv_discrete distributions like betanbinom do not provide a .fit() method.

A standard replacement is to implement maximum likelihood yourself using scipy.optimize. Below we fit a,b assuming n is known.

# Basic SciPy usage
n, a, b = 10, 6.0, 3.0
rv = betanbinom(n, a, b)

rv.pmf([0, 1, 2, 3]), rv.cdf([0, 1, 2, 3])
(array([0.068627, 0.108359, 0.119195, 0.113519]),
 array([0.068627, 0.176987, 0.296182, 0.409701]))
def neg_loglik_ab(log_ab, data, n, eps=1e-12):
    # Negative log-likelihood for a,b (n fixed), parameterized in log-space.
    log_a, log_b = log_ab
    a = np.exp(log_a)
    b = np.exp(log_b)

    if not np.isfinite(a) or not np.isfinite(b) or a <= eps or b <= eps:
        return np.inf

    data = np.asarray(data)
    if np.any(data < 0) or np.any(np.floor(data) != data):
        return np.inf

    ll = logpmf_betanbinom(data, n, a, b).sum()
    return -ll


def fit_ab_mle(data, n, a0=2.0, b0=2.0):
    x0 = np.log([a0, b0])
    res = minimize(neg_loglik_ab, x0=x0, args=(data, n), method="Nelder-Mead")
    a_hat, b_hat = np.exp(res.x)
    return a_hat, b_hat, res


# Simulate data from known parameters, then fit a,b
n_true, a_true, b_true = 10, 6.0, 3.0
rv_true = betanbinom(n_true, a_true, b_true)

data = rv_true.rvs(size=2_000, random_state=rng)

a_hat, b_hat, res = fit_ab_mle(data, n=n_true, a0=3.0, b0=3.0)
(a_true, b_true), (a_hat, b_hat), res.success
((6.0, 3.0), (6.2164460599996625, 3.083330357835461), True)

10) Statistical Use Cases#

10.1 Hypothesis testing / anomaly detection#

If you have a baseline betanbinom(n,a,b) model for failures-before-successes, you can test whether an observed count (k_\text{obs}) is unusually large:

[ \text{p-value} = \Pr(X \ge k_\text{obs}) = \mathrm{sf}(k_\text{obs}-1). ]

This is a one-sided “too many failures” test.

10.2 Bayesian modeling (posterior + posterior predictive)#

If (P\sim\mathrm{Beta}(a,b)) and you observe n successes and k failures (i.e., you stopped at the (n)-th success), then

[ P \mid (k,n) \sim \mathrm{Beta}(a+n, b+k). ]

The posterior predictive distribution for future failures before n_future successes is again beta-negative binomial:

[ X_\text{future} \mid (k,n) \sim \mathrm{betanbinom}(n_\text{future},; a+n,; b+k). ]

10.3 Generative modeling#

In a hierarchical generative model, you might sample (p_i) per individual/group from a Beta distribution, then generate counts via a negative binomial. The resulting marginal distribution across individuals is betanbinom.

# 10.1 Hypothesis testing example: is k_obs unusually large?

n, a, b = 10, 6.0, 3.0
rv = betanbinom(n, a, b)

k_obs = 40
p_value = rv.sf(k_obs - 1)  # P(X >= k_obs)

k_obs, p_value
(40, 0.0027753438349855664)
# 10.2 Bayesian update + posterior predictive example

# Prior on p
a0, b0 = 6.0, 3.0

# Observe: stop after n successes, saw k failures
n_obs = 10
k_obs = 40

# Posterior on p
a_post = a0 + n_obs
b_post = b0 + k_obs

# Posterior mean of p
p_post_mean = a_post / (a_post + b_post)

# Posterior predictive: failures before n_future successes
n_future = 10
rv_pred = betanbinom(n_future, a_post, b_post)

ks = support_for_plotting(rv_pred, q=0.99)

fig = go.Figure()
fig.add_trace(go.Bar(x=ks, y=rv_pred.pmf(ks), name="posterior predictive PMF"))
fig.update_layout(
    title=f"Posterior predictive (n_future={n_future}) after observing k={k_obs} failures, n={n_obs} successes",
    xaxis_title="future failures",
    yaxis_title="probability",
    width=900,
    height=420,
)
fig.show()

p_post_mean
0.2711864406779661

11) Pitfalls#

  • Moment conditions: mean requires (a>1), variance requires (a>2), etc. If a is near 1 the tail can be extremely heavy.

  • Parameter interpretation: n is often an integer in the “wait for n successes” story; extending n to non-integers is mathematically possible via Gamma functions but may not match your data-generating process.

  • Numerical stability: computing (\binom{n+k-1}{k}) directly will overflow; use gammaln/betaln (log-space).

  • Infinite support: CDF/entropy computations require truncation; use sf for tail probabilities and choose cutoffs via quantiles.
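A minimal illustration of the overflow pitfall (parameters chosen only to trigger it): the direct binomial coefficient overflows to `inf` for large k, while the log-space version stays finite.

```python
import numpy as np
from scipy.special import comb, gammaln

# Direct C(n+k-1, k) overflows in double precision for large k;
# the log-gamma form does not.
n, k = 300, 3000
direct = comb(n + k - 1, k)                          # overflows to inf
log_c = gammaln(n + k) - gammaln(n) - gammaln(k + 1) # finite log value
print(direct, log_c)
```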

12) Summary#

  • betanbinom is a Beta mixture of a negative binomial (failures before n successes).

  • It is useful when the success probability is uncertain or heterogeneous, producing overdispersion.

  • The PMF has a compact Beta-function form, and mean/variance are available in closed form (with conditions on a).

  • Sampling is straightforward via the hierarchical model using NumPy-only random generators.

  • SciPy implements the distribution as scipy.stats.betanbinom (but you fit parameters via custom MLE routines).

References#

  • SciPy docstring: scipy.stats.betanbinom

  • Wikipedia: “Beta negative binomial distribution”

import numpy as np
import scipy
import plotly

print("numpy ", np.__version__)
print("scipy ", scipy.__version__)
print("plotly", plotly.__version__)
numpy  1.26.2
scipy  1.15.0
plotly 6.5.2